Faster K-Means Cluster Estimation
There has been considerable work on improving the popular clustering algorithm
k-means in terms of both mean squared error (MSE) and speed. However, most
k-means variants compute the distance from each data point to each cluster
centroid in every iteration. We propose a fast heuristic that overcomes this
bottleneck with only a marginal increase in MSE. We observe that across all
iterations of k-means, a data point changes its membership only among a small
subset of clusters. Our heuristic predicts these clusters for each data point
by looking at nearby clusters after the first iteration of k-means. We augment
well-known variants of k-means with our heuristic to demonstrate its
effectiveness. On various synthetic and real-world datasets, our heuristic
achieves speed-ups of up to 3x compared to efficient variants of k-means.
Comment: 6 pages, Accepted at ECIR 201
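The idea of restricting distance computations to a few nearby clusters can be sketched as follows. This is a minimal illustration only: the candidate count, the deterministic farthest-first initialization, and all names are our own assumptions, not the paper's method.

```python
import numpy as np

def kmeans_with_candidates(X, k, n_candidates=3, iters=20):
    """k-means where, after the first full assignment, each point only
    re-checks its n_candidates nearest centroids (illustrative heuristic)."""
    # Deterministic farthest-first initialization (an assumption of this sketch).
    centers = [X[0]]
    for _ in range(k - 1):
        dmin = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dmin)])
    centers = np.array(centers)

    # First iteration: distances from every point to every centroid.
    d = np.linalg.norm(X[:, None] - centers[None], axis=2)
    cand = np.argsort(d, axis=1)[:, :n_candidates]  # nearby clusters per point
    labels = cand[:, 0]

    for _ in range(iters):
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
        # Later iterations: distances only to each point's candidate clusters.
        dc = np.linalg.norm(X[:, None] - centers[cand], axis=2)
        labels = cand[np.arange(len(X)), np.argmin(dc, axis=1)]
    return labels, centers
```

With n_candidates much smaller than k, the per-iteration distance work drops from O(nk) to O(n * n_candidates), which is the source of the claimed speed-up.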
Neural Network Methods for Boundary Value Problems Defined in Arbitrarily Shaped Domains
Partial differential equations (PDEs) with Dirichlet boundary conditions
defined on boundaries with simple geometry have been successfully treated using
sigmoidal multilayer perceptrons in previous works. This article deals with the
case of complex boundary geometry, where the boundary is determined by a number
of points that belong to it and are closely located, so as to offer a
reasonable representation. Two networks are employed: a multilayer perceptron
and a radial basis function network. The latter is used to enforce the
boundary conditions. The method has been successfully tested
on two-dimensional and three-dimensional PDEs and has yielded accurate
solutions.
Artificial Neural Networks for Solving Ordinary and Partial Differential Equations
We present a method to solve initial and boundary value problems using
artificial neural networks. A trial solution of the differential equation is
written as a sum of two parts. The first part satisfies the boundary (or
initial) conditions and contains no adjustable parameters. The second part is
constructed so as not to affect the boundary conditions. This part involves a
feedforward neural network, containing adjustable parameters (the weights).
Hence by construction the boundary conditions are satisfied and the network is
trained to satisfy the differential equation. The applicability of this
approach ranges from single ODEs to systems of coupled ODEs and also to
PDEs. In this article we illustrate the method by solving a variety of model
problems and present comparisons with finite elements for several cases of
partial differential equations.
Comment: LaTeX file, 26 pages, 21 figs, submitted to IEEE TN
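The trial-solution construction can be illustrated on a single ODE. The sketch below is our own minimal example, not the paper's code: it solves y' = -y with y(0) = 1 on [0, 1] using the trial form y_t(x) = 1 + x*N(x), where N is a one-hidden-layer network. The first part (the constant 1) satisfies the initial condition; the second part vanishes at x = 0, so y_t(0) = 1 for any weights, and training only has to minimize the squared ODE residual. Gradients are taken by central differences for brevity.

```python
import numpy as np

def net(params, x):
    # N(x) = sum_i v_i * tanh(w_i * x + b_i): one hidden layer.
    w, b, v = params
    return np.tanh(np.outer(x, w) + b) @ v

def dnet_dx(params, x):
    # Analytic derivative of the network output with respect to x.
    w, b, v = params
    t = np.tanh(np.outer(x, w) + b)
    return (1.0 - t**2) @ (w * v)

def trial(params, x):
    # y_t(x) = 1 + x * N(x): equals 1 at x = 0 regardless of the weights.
    return 1.0 + x * net(params, x)

def residual_loss(params, x):
    # ODE is y' + y = 0, so minimize mean (y_t' + y_t)^2 at collocation points.
    N, dN = net(params, x), dnet_dx(params, x)
    y = 1.0 + x * N
    dy = N + x * dN
    return np.mean((dy + y) ** 2)

def train(x, hidden=8, steps=200, lr=0.02, eps=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 0.5, hidden) for _ in range(3)]  # w, b, v
    losses = [residual_loss(params, x)]
    for _ in range(steps):
        grads = []
        for p in params:  # central-difference gradient, parameter by parameter
            g = np.zeros_like(p)
            for i in range(len(p)):
                p[i] += eps
                up = residual_loss(params, x)
                p[i] -= 2 * eps
                dn = residual_loss(params, x)
                p[i] += eps
                g[i] = (up - dn) / (2 * eps)
            grads.append(g)
        for p, g in zip(params, grads):
            p -= lr * g
        losses.append(residual_loss(params, x))
    return params, losses
```

The exact solution is e^{-x}; a real implementation would use analytic or automatic differentiation for the weight gradients rather than finite differences.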
Artificial Neural Network Methods in Quantum Mechanics
In a previous article we have shown how one can employ Artificial Neural
Networks (ANNs) in order to solve non-homogeneous ordinary and partial
differential equations. In the present work we consider the solution of
eigenvalue problems for differential and integrodifferential operators, using
ANNs. We start by considering the Schr\"odinger equation for the Morse
potential that has an analytically known solution, to test the accuracy of the
method. We then proceed with the Schr\"odinger and the Dirac equations for a
muonic atom, as well as with a non-local Schr\"odinger integrodifferential
equation that models the system in the framework of the resonating
group method. In two dimensions we consider the well-studied Hénon-Heiles
Hamiltonian and in three dimensions the model problem of three coupled
anharmonic oscillators. The method in all of the treated cases proved to be
highly accurate, robust and efficient. Hence it is a promising tool for
tackling problems of higher complexity and dimensionality.
Comment: LaTeX file, 29 pages, 11 psfigs, submitted in CP
Privacy Preserving Multi-Server k-means Computation over Horizontally Partitioned Data
The k-means clustering is one of the most popular clustering algorithms in
data mining. Recently, much research has focused on settings where the dataset
is divided among multiple parties or is too large to be handled by the data
owner alone. In the latter case, usually some servers
are hired to perform the task of clustering. The dataset is divided by the data
owner among the servers who together perform the k-means and return the cluster
labels to the owner. The major challenge in this method is to prevent the
servers from gaining substantial information about the actual data of the
owner. Several algorithms have been designed in the past that provide
cryptographic solutions to perform privacy preserving k-means. We provide a new
method to perform k-means over a large set using multiple servers. Our
technique avoids heavy cryptographic computations and instead we use a simple
randomization technique to preserve the privacy of the data. The k-means
computed has exactly the same efficiency and accuracy as the k-means computed
over the original dataset without any randomization. We argue that our
algorithm is secure against an honest-but-curious, passive adversary.
Comment: 19 pages, 4 tables. International Conference on Information Systems Security. Springer, Cham, 201
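For context, the basic multi-server workflow over horizontally partitioned data can be sketched as below: each server computes per-cluster sums and counts on its own shard, and aggregating these reproduces exactly the centralized centroid update, which is the sense in which accuracy is unaffected. The paper's randomization step for hiding the data from the servers is not reproduced here; all names are our own.

```python
import numpy as np

def assign(X, centers):
    # Nearest-centroid label for every row of X.
    return np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)

def multi_server_kmeans(shards, init_centers, iters=10):
    """k-means where each 'server' sees only its own horizontal shard.
    Aggregated sums/counts give exactly the centralized centroid update."""
    centers = init_centers.copy()
    k, d = centers.shape
    for _ in range(iters):
        sums, counts = np.zeros((k, d)), np.zeros(k)
        for X in shards:                # computed independently on each server
            lab = assign(X, centers)
            for j in range(k):
                m = lab == j
                sums[j] += X[m].sum(axis=0)
                counts[j] += m.sum()
        nz = counts > 0                 # owner aggregates and updates centroids
        centers[nz] = sums[nz] / counts[nz][:, None]
    return centers
```

Running this on the shards or on the concatenated dataset from the same initial centers yields the same centroids.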
Supersymmetric hybrid inflation in the braneworld scenario
In this paper we reconsider supersymmetric hybrid inflation in the context of
the braneworld scenario. The observational bounds are satisfied with an
inflationary energy scale , without any fine-tuning of the coupling parameter,
provided that the five-dimensional Planck scale is . We have also obtained an
upper bound on the brane tension .
Comment: 8 pages (LaTeX
A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm
K-means is undoubtedly the most widely used partitional clustering algorithm.
Unfortunately, due to its gradient descent nature, this algorithm is highly
sensitive to the initial placement of the cluster centers. Numerous
initialization methods have been proposed to address this problem. In this
paper, we first present an overview of these methods with an emphasis on their
computational efficiency. We then compare eight commonly used linear time
complexity initialization methods on a large and diverse collection of data
sets using various performance criteria. Finally, we analyze the experimental
results using non-parametric statistical tests and provide recommendations for
practitioners. We demonstrate that popular initialization methods often perform
poorly and that there are in fact strong alternatives to these methods.
Comment: 17 pages, 1 figure, 7 table
Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm
Over the past five decades, k-means has become the clustering algorithm of
choice in many application domains primarily due to its simplicity, time/space
efficiency, and invariance to the ordering of the data points. Unfortunately,
the algorithm's sensitivity to the initial selection of the cluster centers
remains its most serious drawback. Numerous initialization methods have
been proposed to address this drawback. Many of these methods, however, have
time complexity superlinear in the number of data points, which makes them
impractical for large data sets. On the other hand, linear methods are often
random and/or sensitive to the ordering of the data points. These methods are
generally unreliable in that the quality of their results is unpredictable.
Therefore, it is common practice to perform multiple runs of such methods and
take the output of the run that produces the best results. Such a practice,
however, greatly increases the computational requirements of the otherwise
highly efficient k-means algorithm. In this chapter, we investigate the
empirical performance of six linear, deterministic (non-random), and
order-invariant k-means initialization methods on a large and diverse
collection of data sets from the UCI Machine Learning Repository. The results
demonstrate that two relatively unknown hierarchical initialization methods due
to Su and Dy outperform the remaining four methods with respect to two
objective effectiveness criteria. In addition, a recent method due to Erisoglu
et al. performs surprisingly poorly.
Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms (Springer, 2014). arXiv admin note: substantial text overlap with arXiv:1304.7465, arXiv:1209.196
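The multiple-runs practice described above can be made concrete: run k-means from several random initializations and keep the run with the lowest sum of squared errors (SSE), at n_runs times the cost of a single run. A minimal sketch, with names and the restart count being our own assumptions:

```python
import numpy as np

def kmeans(X, k, seed, iters=15):
    # Plain k-means from a random initial selection of k data points.
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None] - C[None], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    sse = ((X - C[labels]) ** 2).sum()  # quality of this run
    return C, labels, sse

def best_of_restarts(X, k, n_runs=10):
    # The common practice: several random runs, keep the lowest-SSE one.
    runs = [kmeans(X, k, seed=s) for s in range(n_runs)]
    return min(runs, key=lambda run: run[2])
```

Deterministic, order-invariant initialization methods, the subject of this chapter, avoid this n_runs-fold overhead by producing a single reproducible starting configuration.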